<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
                "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head>
  <title>Description of kmeans</title>
  <meta name="keywords" content="kmeans">
  <meta name="description" content="KMEANS	Trains a k means cluster model.">
  <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
  <meta name="generator" content="m2html &copy; 2003 Guillaume Flandin">
  <meta name="robots" content="index, follow">
  <link type="text/css" rel="stylesheet" href="../../m2html.css">
</head>
<body>
<a name="_top"></a>
<div><a href="../../menu.html">Home</a> &gt;  <a href="#">ReBEL-0.2.7</a> &gt; <a href="#">netlab</a> &gt; kmeans.m</div>

<!--<table width="100%"><tr><td align="left"><a href="../../menu.html"><img alt="<" border="0" src="../../left.png">&nbsp;Master index</a></td>
<td align="right"><a href="menu.html">Index for .\ReBEL-0.2.7\netlab&nbsp;<img alt=">" border="0" src="../../right.png"></a></td></tr></table>-->

<h1>kmeans
</h1>

<h2><a name="_name"></a>PURPOSE <a href="#_top"><img alt="^" border="0" src="../../up.png"></a></h2>
<div class="box"><strong>KMEANS	Trains a k means cluster model.</strong></div>

<h2><a name="_synopsis"></a>SYNOPSIS <a href="#_top"><img alt="^" border="0" src="../../up.png"></a></h2>
<div class="box"><strong>function [centres, options, post, errlog] = kmeans(centres, data, options) </strong></div>

<h2><a name="_description"></a>DESCRIPTION <a href="#_top"><img alt="^" border="0" src="../../up.png"></a></h2>
<div class="fragment"><pre class="comment">KMEANS    Trains a k means cluster model.

    Description
     CENTRES = KMEANS(CENTRES, DATA, OPTIONS) uses the batch K-means
    algorithm to set the centres of a cluster model. The matrix DATA
    represents the data which is being clustered, with each row
    corresponding to a vector. The sum of squares error function is used.
    The point at which a local minimum is achieved is returned as
    CENTRES.  The error value at that point is returned in OPTIONS(8).

    [CENTRES, OPTIONS, POST, ERRLOG] = KMEANS(CENTRES, DATA, OPTIONS)
    also returns the cluster number (in a one-of-N encoding) for each
    data point in POST and a log of the error values after each cycle in
    ERRLOG.    The optional parameters have the following
    interpretations.

    OPTIONS(1) is set to 1 to display error values; also logs error
    values in the return argument ERRLOG. If OPTIONS(1) is set to 0, then
    only warning messages are displayed.  If OPTIONS(1) is -1, then
    nothing is displayed.

    OPTIONS(2) is a measure of the absolute precision required for the
    value of CENTRES at the solution.  If the absolute difference between
    the values of CENTRES between two successive steps is less than
    OPTIONS(2), then this condition is satisfied.

    OPTIONS(3) is a measure of the precision required of the error
    function at the solution.  If the absolute difference between the
    error functions between two successive steps is less than OPTIONS(3),
    then this condition is satisfied. Both this and the previous
    condition must be satisfied for termination.

    OPTIONS(14) is the maximum number of iterations; default 100.

    See also
    <a href="gmminit.html" class="code" title="function mix = gmminit(mix, x, options)">GMMINIT</a>, <a href="gmmem.html" class="code" title="function [mix, options, errlog] = gmmem(mix, x, options)">GMMEM</a></pre></div>

<!-- crossreference -->
<h2><a name="_cross"></a>CROSS-REFERENCE INFORMATION <a href="#_top"><img alt="^" border="0" src="../../up.png"></a></h2>
This function calls:
<ul style="list-style-image:url(../../matlabicon.gif)">
<li><a href="dist2.html" class="code" title="function n2 = dist2(x, c)">dist2</a>	DIST2	Calculates squared distance between two sets of points.</li><li><a href="maxitmess.html" class="code" title="function s = maxitmess()">maxitmess</a>	MAXITMESS Create a standard error message when training reaches max. iterations.</li></ul>
This function is called by:
<ul style="list-style-image:url(../../matlabicon.gif)">
<li><a href="../.././ReBEL-0.2.7/core/gmminitialize.html" class="code" title="function gmmDS = gmminitialize(gmmDS, X, maxI)">gmminitialize</a>	GMMINITIALIZE  Initialises Gaussian mixture model (GMM) from data</li><li><a href="demkmn1.html" class="code" title="">demkmn1</a>	DEMKMEAN Demonstrate simple clustering model trained with K-means.</li><li><a href="gmminit.html" class="code" title="function mix = gmminit(mix, x, options)">gmminit</a>	GMMINIT Initialises Gaussian mixture model from data</li></ul>
<!-- crossreference -->


<h2><a name="_source"></a>SOURCE CODE <a href="#_top"><img alt="^" border="0" src="../../up.png"></a></h2>
<div class="fragment"><pre>0001 <a name="_sub0" href="#_subfunctions" class="code">function [centres, options, post, errlog] = kmeans(centres, data, options)</a>
0002 <span class="comment">%KMEANS    Trains a k means cluster model.</span>
0003 <span class="comment">%</span>
0004 <span class="comment">%    Description</span>
0005 <span class="comment">%     CENTRES = KMEANS(CENTRES, DATA, OPTIONS) uses the batch K-means</span>
0006 <span class="comment">%    algorithm to set the centres of a cluster model. The matrix DATA</span>
0007 <span class="comment">%    represents the data which is being clustered, with each row</span>
0008 <span class="comment">%    corresponding to a vector. The sum of squares error function is used.</span>
0009 <span class="comment">%    The point at which a local minimum is achieved is returned as</span>
0010 <span class="comment">%    CENTRES.  The error value at that point is returned in OPTIONS(8).</span>
0011 <span class="comment">%</span>
0012 <span class="comment">%    [CENTRES, OPTIONS, POST, ERRLOG] = KMEANS(CENTRES, DATA, OPTIONS)</span>
0013 <span class="comment">%    also returns the cluster number (in a one-of-N encoding) for each</span>
0014 <span class="comment">%    data point in POST and a log of the error values after each cycle in</span>
0015 <span class="comment">%    ERRLOG.    The optional parameters have the following</span>
0016 <span class="comment">%    interpretations.</span>
0017 <span class="comment">%</span>
0018 <span class="comment">%    OPTIONS(1) is set to 1 to display error values; also logs error</span>
0019 <span class="comment">%    values in the return argument ERRLOG. If OPTIONS(1) is set to 0, then</span>
0020 <span class="comment">%    only warning messages are displayed.  If OPTIONS(1) is -1, then</span>
0021 <span class="comment">%    nothing is displayed.</span>
0022 <span class="comment">%</span>
0023 <span class="comment">%    OPTIONS(2) is a measure of the absolute precision required for the</span>
0024 <span class="comment">%    value of CENTRES at the solution.  If the absolute difference between</span>
0025 <span class="comment">%    the values of CENTRES between two successive steps is less than</span>
0026 <span class="comment">%    OPTIONS(2), then this condition is satisfied.</span>
0027 <span class="comment">%</span>
0028 <span class="comment">%    OPTIONS(3) is a measure of the precision required of the error</span>
0029 <span class="comment">%    function at the solution.  If the absolute difference between the</span>
0030 <span class="comment">%    error functions between two successive steps is less than OPTIONS(3),</span>
0031 <span class="comment">%    then this condition is satisfied. Both this and the previous</span>
0032 <span class="comment">%    condition must be satisfied for termination.</span>
0033 <span class="comment">%</span>
0034 <span class="comment">%    OPTIONS(14) is the maximum number of iterations; default 100.</span>
0035 <span class="comment">%</span>
0036 <span class="comment">%    See also</span>
0037 <span class="comment">%    GMMINIT, GMMEM</span>
0038 <span class="comment">%</span>
0039 
0040 <span class="comment">%    Copyright (c) Ian T Nabney (1996-2001)</span>
0041 
0042 [ndata, data_dim] = size(data);
0043 [ncentres, dim] = size(centres);
0044 
0045 <span class="keyword">if</span> dim ~= data_dim
0046   error(<span class="string">'Data dimension does not match dimension of centres'</span>)
0047 <span class="keyword">end</span>
0048 
0049 <span class="keyword">if</span> (ncentres &gt; ndata)
0050   error(<span class="string">'More centres than data'</span>)
0051 <span class="keyword">end</span>
0052 
0053 <span class="comment">% Sort out the options</span>
0054 <span class="keyword">if</span> (options(14))
0055   niters = options(14);
0056 <span class="keyword">else</span>
0057   niters = 100;
0058 <span class="keyword">end</span>
0059 
0060 store = 0;
0061 <span class="keyword">if</span> (nargout &gt; 3)
0062   store = 1;
0063   errlog = zeros(1, niters);
0064 <span class="keyword">end</span>
0065 
0066 <span class="comment">% Check if centres and posteriors need to be initialised from data</span>
0067 <span class="keyword">if</span> (options(5) == 1)
0068   <span class="comment">% Do the initialisation</span>
0069   perm = randperm(ndata);
0070   perm = perm(1:ncentres);
0071 
0072   <span class="comment">% Assign first ncentres (permuted) data points as centres</span>
0073   centres = data(perm, :);
0074 <span class="keyword">end</span>
0075 <span class="comment">% Matrix to make unit vectors easy to construct</span>
0076 id = eye(ncentres);
0077 
0078 <span class="comment">% Main loop of algorithm</span>
0079 <span class="keyword">for</span> n = 1:niters
0080 
0081   <span class="comment">% Save old centres to check for termination</span>
0082   old_centres = centres;
0083   
0084   <span class="comment">% Calculate posteriors based on existing centres</span>
0085   d2 = <a href="dist2.html" class="code" title="function n2 = dist2(x, c)">dist2</a>(data, centres);
0086   <span class="comment">% Assign each point to nearest centre</span>
0087   [minvals, index] = min(d2', [], 1);
0088   post = id(index,:);
0089 
0090   num_points = sum(post, 1);
0091   <span class="comment">% Adjust the centres based on new posteriors</span>
0092   <span class="keyword">for</span> j = 1:ncentres
0093     <span class="keyword">if</span> (num_points(j) &gt; 0)
0094       centres(j,:) = sum(data(find(post(:,j)),:), 1)/num_points(j);
0095     <span class="keyword">end</span>
0096   <span class="keyword">end</span>
0097 
0098   <span class="comment">% Error value is total squared distance from cluster centres</span>
0099   e = sum(minvals);
0100   <span class="keyword">if</span> store
0101     errlog(n) = e;
0102   <span class="keyword">end</span>
0103   <span class="keyword">if</span> options(1) &gt; 0
0104     fprintf(1, <span class="string">'Cycle %4d  Error %11.6f\n'</span>, n, e);
0105   <span class="keyword">end</span>
0106 
0107   <span class="keyword">if</span> n &gt; 1
0108     <span class="comment">% Test for termination</span>
0109     <span class="keyword">if</span> max(max(abs(centres - old_centres))) &lt; options(2) &amp; <span class="keyword">...</span>
0110         abs(old_e - e) &lt; options(3)
0111       options(8) = e;
0112       <span class="keyword">return</span>;
0113     <span class="keyword">end</span>
0114   <span class="keyword">end</span>
0115   old_e = e;
0116 <span class="keyword">end</span>
0117 
0118 <span class="comment">% If we get here, then we haven't terminated in the given number of</span>
0119 <span class="comment">% iterations.</span>
0120 options(8) = e;
0121 <span class="keyword">if</span> (options(1) &gt;= 0)
0122   disp(<a href="maxitmess.html" class="code" title="function s = maxitmess()">maxitmess</a>);
0123 <span class="keyword">end</span>
0124</pre></div>
<hr><address>Generated on Tue 26-Sep-2006 10:36:21 by <strong><a href="http://www.artefact.tk/software/matlab/m2html/">m2html</a></strong> &copy; 2003</address>
</body>
</html>